<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
    <head>
        <title>RVM : Using GCSpy</title>
	    <link rel="stylesheet" href="styles/site.css" type="text/css" />
        <META http-equiv="Content-Type" content="text/html; charset=UTF-8">	    
    </head>

    <body>
	    <table class="pagecontent" border="0" cellpadding="0" cellspacing="0" width="100%" bgcolor="#ffffff">
		    <tr>
			    <td valign="top" class="pagebody">
				    <div class="pageheader">
					    <span class="pagetitle">
                            RVM : Using GCSpy
                                                    </span>
				    </div>
				    <div class="pagesubheading">
					    This page last changed on Mar 07, 2007 by <font color="#0050B2">pdonald</font>.
				    </div>

				    <h2><a name="UsingGCSpy-TheGCspyHeapVisualisationFramework"></a>The GCspy Heap Visualisation Framework</h2>

<p>GCspy is a visualisation framework that allows developers to observe the behaviour of the heap and related data structures. For details of the GCspy model, see <a href="http://www.cs.kent.ac.uk/pubs/2002/1426/">GCspy: An adaptable heap visualisation frameworkby Tony Printezis and Richard Jones, OOPSLA'02</a>.   The framework comprises two components that communicate across a socket: a <em>client</em> and a <em>server</em>incorporated into the virtual machine of the system being visualised. The client is usually a visualiser (written in Java) but the framework also provides other tools (for example, to store traces in a compressed file). The GCspy server implementation for JikesRVM was contributed by Richard Jones of the University of Kent.</p>

<p>GCspy is designed to be independent of the target system. Instead, it requires the GC developer to describe their system in terms of four GCspy abstractions, <em>spaces, streams, tiles</em> and <em>events</em>. This description is transmitted to the visualiser when it connects to the server.</p>

<p>A <em>space</em> is an abstraction of a component of the system; it may represent a memory region, a free-list, a remembered-set or whatever. Each space is divided into a number of blocks which are represented by the visualiser as <em>tiles</em>.  Each space will have a number of attributes &#45;&#45; <em>streams</em> &#45;&#45; such as the amount of space used, the number of objects it contains, the length of a free-list and so on.</p>

<p>In order to instrument a Jikes RVM collector with GCspy:</p>
<ol>
	<li>Provide a <tt>startGCspyServer</tt> method in that collector's plan. That method initialises the GCspy server with the port on which to communicate and a list of event names, instantiates drivers for each space, and then starts the server.</li>
	<li>Gather data from each space for the tiles of each stream (e.g. before, during and after each collection).</li>
	<li>Provide a driver for each space.</li>
</ol>


<p><em>Space drivers</em> handle communication between collectors and the GCspy infrastructure by mapping information collected by the memory manager to the space's streams. A typical space driver will:</p>
<ul>
	<li>Create a GCspy <em>space</em>.</li>
	<li>Create a <em>stream</em> for each attribute of the space.</li>
	<li>Update the tile statistics as the memory manager passes it information.</li>
	<li>Send the tile data along with any summary or control information to the visualiser.</li>
</ul>


<p>The Jikes RVM SSGCspy plan gives an example of how to instrument a collector. It provides GCspy spaces, streams and drivers for the semi-spaces, the immortal space and the large object space, and also illustrates how performance may be traded for the gathering of more detailed information.</p>

<h2><a name="UsingGCSpy-InstallationofGCspywithJikesRVM"></a>Installation of GCspy with Jikes RVM</h2>


<h3><a name="UsingGCSpy-SystemRequirements"></a>System Requirements</h3>

<p>The GCspy C server code needs a pthread (created in <tt>gcspyStartserver()</tt> in <tt>sys.C</tt>) in order to run. So, GCspy will only work on a system where you've build Jikes RVM with <tt>config.single.virtual.processor</tt> set to <tt>0</tt>. The build process will fail if you try to configure such a build.</p>

<h4><a name="UsingGCSpy-BuildingGCSpy"></a>Building GCSpy</h4>

<p>The GCspy client code makes use of the Java Advanced Imaging (JAI) API. The build system will attempt to download and install the JAI component when required but this is only supported on the <tt>ia32-linux</tt> platform. The build system will also attempt to download and install the GCSpy server when required.</p>

<h4><a name="UsingGCSpy-BuildingJikesRVMtouseGCspy"></a>Building Jikes RVM to use GCspy</h4>

<p>To build the Jikes RVM with GCSpy support the configuration parameter <tt>config.include.gcspy</tt> must be set to <tt>1</tt> such as in the <tt>BaseBaseSemiSpaceGCspy</tt>configuration. You can also have the Jikes RVM build process create a script to start the GCSpy client tool if GCSpy was built with support for client component. To achieve this the configuration parameter <tt>config.include.gcspy-client</tt> must be set to <tt>1</tt>.</p>

<p>The following steps build the Jikes RVM with support for GCSpy on linux-ia32 platform.<br/>
<tt>$ cd $RVM_ROOT</tt><br/>
<tt>$ ant &#45;Dhost.name=ia32-linux &#45;Dconfig.name=BaseBaseSemiSpaceGCspy &#45;Dconfig.include.gcspy-client=1</tt></p>

<p>It is also possible to build the Jikes RVM with GCSpy support but link it against a fake stub implementation rather than the real GCSpy implementation. This is achieved by setting the configuration parameter <tt>config.include.gcspy-stub</tt> to <tt>1</tt>. This is used in the nightly testing process. </p>

<h4><a name="UsingGCSpy-RunningJikesRVMwithGCspy"></a>Running Jikes RVM with GCspy</h4>

<p>To start Jikes RVM with GCSpy enabled you need to specify the port the GCSpy server will listen on.<br/>
<tt>$ cd $RVM_ROOT/dist/BaseBaseSemiSpaceGCspy_ia32-linux</tt><br/>
<tt>$ ./rvm &#45;Xms20m &#45;X:gc:gcspyPort=3000 &#45;X:gc:gcspyWait=true &amp;</tt></p>

<p>Then you need to start the GCspy visualiser client.<br/>
<tt>$ cd $RVM_ROOT/dist/BaseBaseSemiSpaceGCspy_ia32-linux</tt><br/>
<tt>$ ./tools/gcspy/gcspy</tt></p>

<p>After this you can specify the port and host to connect to (i.e. localhost:3000) and click the "Connect" button in the bottom right-hand corner of the visualiser.</p>

<h2><a name="UsingGCSpy-Commandlinearguments"></a>Command line arguments</h2>

<p>Additional GCspy-related arguments to the <tt>rvm</tt> command:</p>
<ul>
	<li><tt>&#45;X:gc:gcspyPort=<em>&lt;port&gt;</em></tt><br/>
The number of the port on which to connect to the visualiser.  The default is port <tt>0</tt>, which signifies no connection.</li>
	<li><tt>&#45;X:gc:gcspyWait=<em>&lt;true&#124;false&gt;</em></tt><br/>
Whether Jikes RVM should wait for a visualiser to connect.</li>
	<li><tt>&#45;X:gc:gcspyTilesize=<em>&lt;size&gt;</em></tt><br/>
How many KB are represented by one tile.  The default value is 128.</li>
</ul>


<h2><a name="UsingGCSpy-WritingGCspydrivers"></a>Writing GCspy drivers</h2>

<p>To instrument a new collector with GCspy, you will probably want to subclass your  collector and to write new drivers for it.  The following sections explain the modifications you need to make and how to write a driver. You may use <tt>org.mmtk.plan.semispace.gcspy</tt> and its drivers as an example.</p>

<p>The recommended way to instrument a Jikes RVM collector with GCspy is to create a <tt>gcspy</tt> subdirectory in the directory of the collector being instrumented, e.g. <tt>MMTk/src/org/mmtk/plan/semispace/gcspy</tt>.  In that directory, we need 5 classes:</p>
<ul>
	<li><tt>SSGCspy</tt>,</li>
	<li><tt>SSGCspyCollector</tt>,</li>
	<li><tt>SSGCspyConstraints</tt></li>
	<li><tt>SSGCspyMutator</tt> and</li>
	<li><tt>SSGCspyTraceLocal</tt>.</li>
</ul>


<p><tt>SSGCspy</tt> is the plan for the instrumented collector. It is a  subclass of <tt>SS</tt>.</p>

<p><tt>SSGCspyConstraints</tt> extends <tt>SSConstraints</tt> to provide methods<tt>boolean needsLinearScan()</tt> and <tt>boolean withGCspy()</tt>, both of which return true.</p>

<p><tt>SSGCspyTraceLocal</tt> extends <tt>SSTraceLocal</tt> to override methods<tt>traceObject</tt> and <tt>willNotMove</tt>to ensure that tracing  deals properly with GCspy objects: the GCspyTraceLocal file will be similar for  any instrumented collector.</p>

<p>The instrumented collector, <tt>SSGCspyCollector</tt>, extends <tt>SSCollector</tt>. It needs to override <tt>collectionPhase</tt>.</p>

<p>Similarly, <tt>SSGCspyMutator</tt> extends <tt>SSMutator</tt> and must also override its  parent's methods<tt>collectionPhase</tt>, to allow the allocators to collect data; and its <tt>alloc</tt> and <tt>postAlloc</tt> methods to allocate GCspy objects in GCspy's heap space.</p>

<h3><a name="UsingGCSpy-ThePlan"></a>The Plan</h3>

<p><tt>SSGCspy.startGCspyServer</tt> is called immediately before the "main" method is loaded and run.  It initialises the GCspy server with the port on which to communicate, adds event names,  instantiates a driver for each space, and then starts the server, forcing the VM to wait for a GCspy to connect if necessary. This method has the following responsibilities.</p>
<ol>
	<li>Initialise the GCspy server: server.init(name, portNumber, verbose);</li>
	<li>Add each event to the <tt>ServerInterpreter</tt> (`server' for short) server.addEvent(eventID, eventName);</li>
	<li>Set some general information about the server (e.g. name of the collector, build, etc) server.setGeneralInfo(info);</li>
	<li>Create new drivers for each component to be visualised myDriver = new MyDriver(server, args...);</li>
</ol>


<p>Drivers extend <tt>AbstractDriver</tt> and register their space with the <tt>ServerInterpreter</tt>. In addition to the server, drivers will take        as arguments the name of the space, the MMTk space, the tilesize, and       whether this space is to be the main space in the visualiser.</p>

<h3><a name="UsingGCSpy-TheCollectorandMutator"></a>The Collector and Mutator</h3>

<p>Instrumenters  will typically want to add data collection points  before, during and after a collection by overriding <tt>collectionPhase</tt> in <tt>SSGCspyCollector</tt> and <tt>SSGCspyMutator</tt>.</p>

<p><tt>SSGCspyCollector</tt> deals with the data in the semi-spaces that has been allocated there (copied) by the collector. It only does any real work at the end of the collector's last tracing phase, <tt>FORWARD_FINALIZABLE</tt>.</p>

<p><tt>SSGCspyMutator</tt> is more complex: as well as gathering data for objects that it allocated in From-space at the start of the <tt>PREPARE_MUTATOR</tt> phase, it also deals with the  immortal and large object spaces.</p>

<p>At a collection point, the collector or mutator will typically</p>
<ol>
	<li>Return if the GCspy port number is 0 (as no client can be connected).</li>
	<li>Check whether the server is connected at this event. If so, the compensation timer (which discounts the time taken by GCspy to ather the data) should be started before gathering data and stopped after it.</li>
	<li>After gathering the data, have each driver  call its <tt>transmit</tt> method.</li>
	<li><tt>SSGCspyCollector</tt> does <em>not</em> call the GCspy server's <tt>serverSafepoint</tt> method, as the collector phase is usually followed by a mutator    phase. Instead, <tt>serverSafepoint</tt> can be called by <tt>SSGCspyMutator</tt> to indicate that this is a point at which the server can pause, play one event, etc.</li>
</ol>


<p>Gathering data will vary from MMTk space to space. It will typically be necessary to resize a space before gathering data. For a space,</p>
<ol>
	<li>We may need to reset the GCspy driver's data depending on the collection phase.</li>
	<li>We will pass the driver as a call-back to the allocator. The allocator will typically ask the driver to set the range of addresses from which we want to gather data, using the driver's <tt>setRange</tt> method. The driver should then iterate through its MMTk space, passing a reference to each object found to the driver's scan method.</li>
</ol>


<h3><a name="UsingGCSpy-TheDriver"></a>The Driver</h3>

<p>GCspy space drivers extend <tt>AbstractDriver</tt>. This class creates a new GCspy <tt>ServerSpace</tt> and initializes the control values for each tile in the space. <em>Control</em> values indicate whether a tile is <em>used</em>, <em>unused</em>, a <em>background</em>, a <em>separator</em> or a <em>link</em>. The constructor for a typical space driver will:</p>
<ol>
	<li>Create a GCspy <tt>Stream</tt> for each attribute of a space.</li>
	<li>Initialise the tile statistics in each stream.</li>
</ol>


<p>Some drivers may also create a <tt>LinearScan</tt> object to handle call-backs  from the VM as it sweeps the heap (see above).</p>

<p>The chief roles of a driver are to accumulate tile statistics, and to transmit the summary and control data and the data for all of their streams. Their data gathering interface is the <tt>scan</tt> method (to which an object  reference or address is passed).</p>

<p>When the collector or mutator has finished gathering data, it calls the <tt>transmit</tt> of the driver for each space that needs to send  its data. Streams may send values of types byte, short or int, implemented through classes <tt>ByteStream</tt>, <tt>ShortStream</tt> or <tt>IntStream</tt>. A driver's <tt>transmit</tt> method will typically:</p>
<ol>
	<li>Determine whether a GCspy client is connected and interested in this event, e.g. <tt>server.isConnected(event)</tt></li>
	<li>Setup the summaries for each stream, e.g. <tt>stream.setSummary(values...);</tt></li>
	<li>Setup the control information for each tile. e.g. <tt>controlValues(CONTROL_USED, start, numBlocks);</tt><br/>
      <tt>controlValues(CONTROL_UNUSED, end, remainingBlocks);</tt></li>
	<li>Set up the space information, e.g. <tt>setSpace(info);</tt></li>
	<li>Send the data for all streams, e.g. <tt>send(event, numTiles);</tt></li>
</ol>


<p>Note that <tt>AbstractDriver.send</tt> takes care of sending the information for all streams (including control data).</p>

<h3><a name="UsingGCSpy-Subspaces"></a>Subspaces</h3>

<p><tt>Subspace</tt> provides a useful abstraction of a contiguous region of a heap, recording its start and end address, the index of its first block, the size of  blocks in this space and the number of blocks in the region. In particular, <tt>Subspace</tt> provides methods to:</p>
<ul>
	<li>Determine whether an address falls within a subspace;</li>
	<li>Determine the block index of the address;</li>
	<li>Calculate how much space remains in a block after a given address;</li>
</ul>


				    
                    			    </td>
		    </tr>
	    </table>
	    <table border="0" cellpadding="0" cellspacing="0" width="100%">
			<tr>
				<td height="12" background="http://docs.codehaus.org/images/border/border_bottom.gif"><img src="images/border/spacer.gif" width="1" height="1" border="0"/></td>
			</tr>
		    <tr>
			    <td align="center"><font color="grey">Document generated by Confluence on Jul 04, 2010 19:57</font></td>
		    </tr>
	    </table>
    </body>
</html>